 search engine


27aa3aeff0f8460a7b43d30fa6c5c032-Paper-Datasets_and_Benchmarks_Track.pdf

Neural Information Processing Systems

Large Language Models (LLMs) are transforming search engines into Conversational Search Engines (CSE). Consequently, Search Engine Optimization (SEO) is being shifted into Conversational Search Engine Optimization (C-SEO). We are beginning to see dedicated C-SEO methods for modifying web documents to increase their visibility in CSE responses. However, they are often tested only for a limited breadth of application domains; we do not know whether certain C-SEO methods would be effective for a broad range of domains. Moreover, existing evaluations consider only a single-actor scenario where only one web document adopts a C-SEO method; in reality, multiple players are likely to competitively adopt the cutting-edge C-SEO techniques, drawing an analogy from the dynamics we have seen in SEO.


A German Court Has Ruled That Google Is Liable for False Statements Generated by AI Overviews

WIRED

The ruling holds that a company that designs, trains, operates, and manages an AI system must assume legal liability for any damages caused by the responses it generates. A local court in Germany has issued a ruling that could reshape the operation of search engines and artificial-intelligence-based chatbots worldwide. The Munich Regional Court preliminarily ruled that Google is liable for a series of false statements generated by its AI Overviews feature, requiring the company to prevent the dissemination of erroneous or inaccurate claims through its search engine. The ruling stems from a case first reported by the Decoder, in which two publishers discovered that Google's AI-generated summaries linked them, in certain searches, to questionable business practices, scams, and subscription-related frauds, without any basis for doing so. Earlier this year, the affected companies sent the tech giant a cease-and-desist letter, according to the report.


VRAG-RL: Empower Vision-Perception-Based RAG for Visually Rich Information Understanding via Iterative Reasoning with Reinforcement Learning

Neural Information Processing Systems

Effectively retrieving, reasoning over, and understanding visually rich information remains a challenge for traditional Retrieval-Augmented Generation (RAG) methods. On the one hand, traditional text-based methods cannot handle visual information. On the other hand, current vision-based RAG approaches are often limited by fixed pipelines and frequently struggle to reason effectively because the fundamental capabilities of the models are insufficiently activated. As reinforcement learning (RL) has been proven to be beneficial for model reasoning, we introduce VRAG-RL, a novel RL framework tailored for complex reasoning across visually rich information. With this framework, VLMs interact with search engines, autonomously sampling single-turn or multi-turn reasoning trajectories with the help of visual perception tokens and undergoing continual optimization based on these samples.


Google Search Goes Agentic--and Doesn't Need You Anymore

WIRED

Instead of clicking on a bunch of random website links, I was reading an AI summary positioned at the top of my search results and sometimes clicking through to double-check the accuracy of the output. The next evolution of Search that Google is building asks for even less active participation from users. You're really the most involved at the start of the journey, and that's it. You tell the agents what you want to know, and they do the clicking and even calling on your behalf. Rather than you going off on some online adventure, it's the agent that's hoovering up anything it can find and bouncing between different sites.


Toolformer: Language Models Can Teach Themselves to Use Tools

Neural Information Processing Systems

Language models (LMs) exhibit remarkable abilities to solve new tasks from just a few examples or textual instructions, especially at scale. They also, paradoxically, struggle with basic functionality, such as arithmetic or factual lookup, where much simpler and smaller specialized models excel. In this paper, we show that LMs can teach themselves to use external tools via simple APIs and achieve the best of both worlds. We introduce Toolformer, a model trained to decide which APIs to call, when to call them, what arguments to pass, and how to best incorporate the results into future token prediction. This is done in a self-supervised way, requiring nothing more than a handful of demonstrations for each API. We incorporate a range of tools, including a calculator, a Q&A system, a search engine, a translation system, and a calendar. Toolformer achieves substantially improved zero-shot performance across a variety of downstream tasks, often competitive with much larger models, without sacrificing its core language modeling abilities.


A Large Scale Search Dataset for Unbiased Learning to Rank

Neural Information Processing Systems

The unbiased learning to rank (ULTR) problem has been greatly advanced by recent deep learning techniques and well-designed debiasing algorithms. However, promising results on existing benchmark datasets may not carry over to practical scenarios because of limitations in those datasets. First, their semantic feature extraction is outdated, and state-of-the-art large-scale pre-trained language models like BERT cannot be used because the original text is missing. Second, display features are incomplete, so in-depth study of ULTR is impossible; for example, the displayed abstract needed to analyze click necessary bias is not available. Third, most existing datasets adopt synthetic user feedback, and real-world user feedback is largely missing. To overcome these disadvantages, we introduce the Baidu-ULTR dataset. It comprises 1.2 billion randomly sampled search sessions and 7,008 expert-annotated queries (397,572 query-document pairs).


Bing is the anti-AI search engine you should be using

PCWorld

PCWorld argues that Bing serves as a superior alternative to AI-heavy search engines by prioritizing human-authored content over automated summaries. AI search engines like Google's AI Mode often hide original sources and provide misleading information, with traffic to publishers dropping significantly.


Adaptive Gaussian Process Search for Simulation-Based Sample Size Estimation in Clinical Prediction Models: Validation of the pmsims R Package

arXiv.org Machine Learning

Background: Determining an adequate sample size is essential for developing reliable and generalisable clinical prediction models, yet practical guidance on selecting appropriate methods remains limited. Existing analytical and simulation-based approaches often rely on restrictive assumptions and focus on mean-based criteria. We present and validate pmsims, an R package that uses Gaussian process surrogate modelling to provide a flexible and computationally efficient simulation-based framework for sample size determination across diverse prediction settings. Methods: We conducted a comprehensive simulation study with two aims. First, we compared three search engines implemented in pmsims: a Gaussian process-based adaptive method, a deterministic bisection method, and a hybrid approach, across binary, continuous, and survival outcomes. Second, we benchmarked the best-performing pmsims engine against existing analytical (pmsampsize) and simulation-based (samplesizedev) methods, evaluating recommended sample sizes, computational time, and achieved performance on large independent validation datasets. Results: The Gaussian process-based method consistently produced the most stable sample size estimates, particularly in low-signal, high-dimensional settings. In benchmarking, pmsims achieved performance close to prespecified targets across all outcome types, matching simulation-based approaches and outperforming analytical methods in more challenging scenarios. Conclusions: pmsims provides an efficient and flexible framework for principled sample size planning in clinical prediction modelling, requiring fewer model evaluations than non-adaptive simulation approaches.
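
As a rough illustration of the deterministic bisection idea mentioned in the abstract above, the sketch below searches for the smallest sample size whose estimated performance meets a target. It is not the pmsims implementation: `min_sample_size` and `toy_performance` are hypothetical names, the performance curve is a made-up monotone function standing in for an expensive simulation, and the search assumes performance never decreases as the sample size grows.

```python
def toy_performance(n: int) -> float:
    """Hypothetical stand-in for a simulation: performance rises with n."""
    return n / (n + 100)


def min_sample_size(perf, target: float, lo: int, hi: int) -> int:
    """Bisection search for the smallest n in [lo, hi] with perf(n) >= target.

    Assumes perf is non-decreasing in n (an assumption of this sketch).
    """
    if perf(hi) < target:
        raise ValueError("target performance not reached at the upper bound")
    while lo < hi:
        mid = (lo + hi) // 2
        if perf(mid) >= target:
            hi = mid       # mid is large enough; try smaller sizes
        else:
            lo = mid + 1   # mid is too small; discard it and everything below
    return lo


if __name__ == "__main__":
    print(min_sample_size(toy_performance, target=0.5, lo=1, hi=10_000))
```

Each iteration halves the search interval, so the number of performance evaluations grows only logarithmically with the width of the range. With noisy simulated performance the monotonicity assumption can fail, which is presumably one motivation for the surrogate-based search the abstract describes.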


An engine not a camera: Measuring performative power of online search

Neural Information Processing Systems

The power of digital platforms is at the center of major ongoing policy and regulatory efforts. To advance existing debates, we designed and executed an experiment to measure the performative power of online search providers. Instantiated in our setting, performative power quantifies the ability of a search engine to steer web traffic by rearranging results. To operationalize this definition we developed a browser extension that performs unassuming randomized experiments in the background. These randomized experiments emulate updates to the search algorithm and identify the causal effect of different content arrangements on clicks. Analyzing tens of thousands of clicks, we discuss what our robust quantitative findings say about the power of online search engines, using the Google Shopping antitrust investigation as a case study. More broadly, we envision our work to serve as a blueprint for how the recent definition of performative power can help integrate quantitative insights from online experiments with future investigations into the economic power of digital platforms.
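
To make the idea of a randomized rearrangement experiment concrete, here is a minimal sketch that estimates the effect of an arrangement on clicks as a difference in click rates between a control and a treatment group. This is a plain difference-in-means estimator chosen for illustration, not necessarily the estimator used in the paper; the function names and the click data are invented.

```python
def click_rate(clicks: list[int]) -> float:
    """Share of sessions in which the focal result was clicked (1) or not (0)."""
    return sum(clicks) / len(clicks)


def arrangement_effect(control: list[int], treatment: list[int]) -> float:
    """Difference in click rates between treatment and control sessions.

    With random assignment to arrangements, this difference can be read as
    the average causal effect of the rearrangement on clicks.
    """
    return click_rate(treatment) - click_rate(control)


if __name__ == "__main__":
    # Made-up data: 1 = the originally top-ranked result was clicked.
    control = [1, 1, 1, 0]    # default arrangement
    treatment = [1, 0, 0, 0]  # focal result moved down the page
    print(arrangement_effect(control, treatment))
```

A negative value means the rearrangement reduced clicks on the focal result; a real analysis would also report uncertainty, for example a confidence interval over many sessions.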


The Search Engine for OnlyFans Models Who Look Like Your Crush

WIRED

Presearch's "Doppelgänger" is trying to help people discover adult creators rather than use nonconsensual deepfakes. For three days in February, porn star Alix Lynx flew to Miami for her first exclusive creator gathering where she was in full grind mode: shooting Reels and talking strategy with other creators. "It was kind of like SoHo House for OnlyFans girls," she says of the experience, which is called The Circle and drew more than a dozen sex workers, including Remy LaCroix and Forrest Smith. Lynx, who is a former webcam model turned OnlyFans starlet, has a combined 2 million followers across Instagram, TikTok, and X. She joined OnlyFans in 2017 with "the luxury of having my own following," she says, but those numbers haven't always translated to subscriptions. It's why she was in Miami.